[pull] master from torvalds:master#73
Merged
pull[bot] merged 282 commits intolinux-doc:masterfrom Oct 12, 2020
Merged
Conversation
Early Intel hardware implementations of Memory Bandwidth Allocation (MBA) could only control bandwidth at the processor core level. This meant that when two processes with different bandwidth allocations ran simultaneously on the same core the hardware had to resolve this difference. It did so by applying the higher throttling value (lower bandwidth) to both processes. Newer implementations can apply different throttling values to each thread on a core. Introduce a new resctrl file, "thread_throttle_mode", on Intel systems that shows to the user how throttling values are allocated, per-core or per-thread. On systems that support per-core throttling, the file will display "max". On newer systems that support per-thread throttling, the file will display "per-thread". AMD confirmed in [1] that AMD bandwidth allocation is already at thread level but that the AMD implementation does not use a memory delay throttle mode. So to avoid confusion the thread throttling mode would be UNDEFINED on AMD systems and the "thread_throttle_mode" file will not be visible. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/1598296281-127595-3-git-send-email-fenghua.yu@intel.com Link: [1] https://lore.kernel.org/lkml/18d277fd-6523-319c-d560-66b63ff606b8@amd.com
A long time ago, Linux cleared IA32_MCG_STATUS at the very end of machine check processing. Then, some fancy recovery and IST manipulation was added in: d4812e1 ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks") and clearing IA32_MCG_STATUS was pulled earlier in the function. Next change moved the actual recovery out of do_machine_check() and just used task_work_add() to schedule it later (before returning to the user): 5567d11 ("x86/mce: Send #MC singal from task work") Most recently the fancy IST footwork was removed as no longer needed: b052df3 ("x86/entry: Get rid of ist_begin/end_non_atomic()") At this point there is no reason remaining to clear IA32_MCG_STATUS early. It can move back to the very end of the function. Also move sync_core(). The comments for this function say that it should only be called when instructions have been changed/re-mapped. Recovery for an instruction fetch may change the physical address. But that doesn't happen until the scheduled work runs (which could be on another CPU). [ bp: Massage commit message. ] Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200824221237.5397-1-tony.luck@intel.com
The ptrace() test forgot to reap its child. Reap it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/e7700a503f30e79ab35a63103938a19893dbeff2.1598461151.git.luto@kernel.org
…LDT GS This tests commit: 8ab4952 ("x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task") Unpatched kernels will OOPS. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/c618ae86d1f757e01b1a8e79869f553cb88acf9a.1598461151.git.luto@kernel.org
Remove asm/io_apic.h which is included more than once. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Pekka Enberg <penberg@kernel.org> Link: https://lkml.kernel.org/r/20200819112910.7629-1-wanghai38@huawei.com
…ions Intel TSX suspend load tracking instructions aim to give a way to choose which memory accesses do not need to be tracked in the TSX read set. Add TSX suspend load tracking CPUID feature flag TSXLDTRK for enumeration. A processor supports Intel TSX suspend load address tracking if CPUID.0x07.0x0:EDX[16] is present. Two instructions XSUSLDTRK, XRESLDTRK are available when this feature is present. The CPU feature flag is shown as "tsxldtrk" in /proc/cpuinfo. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Cathy Zhang <cathy.zhang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1598316478-23337-2-git-send-email-cathy.zhang@intel.com
TSX suspend load tracking instruction is supported by the Intel uarch Sapphire Rapids. It aims to give a way to choose which memory accesses do not need to be tracked in the TSX read set. It's availability is indicated as CPUID.(EAX=7,ECX=0):EDX[bit 16]. Expose TSX Suspend Load Address Tracking feature in KVM CPUID, so KVM could pass this information to guests and they can make use of this feature accordingly. Signed-off-by: Cathy Zhang <cathy.zhang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1598316478-23337-3-git-send-email-cathy.zhang@intel.com
/proc/cpuinfo shows features which the kernel supports. Some of these flags are derived from CPUID, and others are derived from other sources, including some that are entirely software-based. Currently, there is not any documentation in the kernel about how /proc/cpuinfo flags are generated and what it means when they are missing. Add documentation for /proc/cpuinfo feature flags enumeration. Document how and when x86 feature flags are used. Also discuss what their presence or absence mean for the kernel and users. Suggested-by: Dave Hansen <dave.hansen@intel.com> Co-developed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200831183500.15481-1-kyung.min.park@intel.com
Fix build warning since this file is already listed in include/asm-generic/Kbuild. ../scripts/Makefile.asm-generic:25: redundant generic-y found in arch/microblaze/include/asm/Kbuild: hw_irq.h Fixes: 630f289 ("asm-generic: make more kernel-space headers mandatory") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/4d992aee-8a69-1769-e622-8d6d6e316346@infradead.org Signed-off-by: Michal Simek <michal.simek@xilinx.com>
When pci_get_device_func() fails, the driver doesn't need to execute pci_dev_put(). mci should still be freed, though, to prevent a memory leak. When pci_enable_device() fails, the error injection PCI device "einj" doesn't need to be disabled either. [ bp: Massage commit message, rename label to "bail_mc_free". ] Fixes: 52608ba ("i5100_edac: probe for device 19 function 0") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200826121437.31606-1-dinghao.liu@zju.edu.cn
Most of the compat vDSO code can be built and guarded using IS_ENABLED, so drop the unnecessary #ifdefs. Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
There's really no need to put every parameter on a new line when calling a function with a long name, so reformat the *setup_additional_pages() functions in the vDSO setup code to follow the usual conventions. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
platform_get_irq() returns a negative error number on error. In such a case, comparison to 0 would pass the check therefore check the return value properly, whether it is negative. [ bp: Massage commit message. ] Fixes: 9b7e624 ("EDAC, aspeed: Add an Aspeed AST2500 EDAC driver") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Stefan Schaeckeler <schaecsn@gmx.net> Link: https://lkml.kernel.org/r/20200827070743.26628-1-krzk@kernel.org
platform_get_irq() returns a negative error number on error. In such a case, comparison to 0 would pass the check therefore check the return value properly, whether it is negative. [ bp: Massage commit message. ] Fixes: 86a18ee ("EDAC, ti: Add support for TI keystone and DRA7xx EDAC") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tero Kristo <t-kristo@ti.com> Link: https://lkml.kernel.org/r/20200827070743.26628-2-krzk@kernel.org
Add Memory Tagging Extension system register definitions together with the relevant bitfields. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
Once user space is given access to tagged memory, the kernel must be able to clear/save/restore tags visible to the user. This is done via the linear mapping, therefore map it as such. The new MT_NORMAL_TAGGED index for MAIR_EL1 is initially mapped as Normal memory and later changed to Normal Tagged via the cpufeature infrastructure. From a mismatched attribute aliases perspective, the Tagged memory is considered a permission and it won't lead to undefined behaviour. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Add the cpufeature and hwcap entries to detect the presence of MTE. Any secondary CPU not supporting the feature, if detected on the boot CPU, will be parked. Add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling MTE. The Normal Tagged memory type is configured in MAIR_EL1 before the MMU is enabled in order to avoid disrupting other CPUs in the CnP domain. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
inst.h was included in calling.h solely to instantiate the RDPID macro. The usage of RDPID was removed in 6a3ea3e ("x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM") so remove the include. Fixes: 6a3ea3e ("x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200827171735.93825-1-ubizjak@gmail.com
KVM does not support MTE in guests yet, so clear the corresponding field in the ID_AA64PFR1_EL1 register. In addition, inject an undefined exception in the guest if it accesses one of the GCR_EL1, RGSR_EL1, TFSR_EL1 or TFSRE0_EL1 registers. While the emulate_sys_reg() function already injects an undefined exception, this patch prevents the unnecessary printk. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Price <steven.price@arm.com> Acked-by: Marc Zyngier <maz@kernel.org>
Add MTE-specific SIGSEGV codes to siginfo.h and update the x86 BUILD_BUG_ON(NSIGSEGV != 7) compile check. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> [catalin.marinas@arm.com: renamed precise/imprecise to sync/async] [catalin.marinas@arm.com: dropped #ifdef __aarch64__, renumbered] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Will Deacon <will@kernel.org>
The Memory Tagging Extension has two modes of notifying a tag check fault at EL0, configurable through the SCTLR_EL1.TCF0 field: 1. Synchronous raising of a Data Abort exception with DFSC 17. 2. Asynchronous setting of a cumulative bit in TFSRE0_EL1. Add the exception handler for the synchronous exception and handling of the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in do_notify_resume(). On a tag check failure in user-space, whether synchronous or asynchronous, a SIGSEGV will be raised on the faulting thread. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
For arm64 MTE support it is necessary to be able to mark pages that contain user space visible tags that will need to be saved/restored e.g. when swapped out. To support this add a new arch specific flag (PG_arch_2). This flag is only available on 64-bit architectures due to the limited number of spare page flags on the 32-bit ones. Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
When a huge page is split into normal pages, part of the head page flags are transferred to the tail pages. However, the PG_arch_* flags are not part of the preserved set. PG_arch_2 is used by the arm64 MTE support to mark pages that have valid tags. The absence of such flag would cause the arm64 set_pte_at() to clear the tags in order to avoid stale tags exposed to user or the swapping out hooks to ignore the tags. Not preserving PG_arch_2 on huge page splitting leads to tag corruption in the tail pages. Preserve the newly added PG_arch_2 flag in __split_huge_page_tail(). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
…ROT_MTE Pages allocated by the kernel are not guaranteed to have the tags zeroed, especially as the kernel does not (yet) use MTE itself. To ensure the user can still access such pages when mapped into its address space, clear the tags via set_pte_at(). A new page flag - PG_mte_tagged (PG_arch_2) - is used to track pages with valid allocation tags. Since the zero page is mapped as pte_special(), it won't be covered by the above set_pte_at() mechanism. Clear its tags during early MTE initialisation. Co-developed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
When the Memory Tagging Extension is enabled, the tags need to be
preserved across page copy (e.g. for copy-on-write, page migration).
Introduce MTE-aware copy_{user_,}highpage() functions to copy tags to
the destination if the source page has the PG_mte_tagged flag set.
copy_user_page() does not need to handle tag copying since, with this
patch, it is only called by the DAX code where there is no source page
structure (and no source tags).
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Since clear_user_page() calls clear_page() directly, avoid the unnecessary indirection. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
When the Memory Tagging Extension is enabled, two pages are identical only if both their data and tags are identical. Make the generic memcmp_pages() a __weak function and add an arm64-specific implementation which returns non-zero if any of the two pages contain valid MTE tags (PG_mte_tagged set). There isn't much benefit in comparing the tags of two pages since these are normally used for heap allocations and likely to differ anyway. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
Similarly to arch_calc_vm_prot_bits(), introduce a dummy arch_calc_vm_flag_bits() invoked from calc_vm_flag_bits(). This macro can be overridden by architectures to insert specific VM_* flags derived from the mmap() MAP_* flags. Signed-off-by: Kevin Brodsky <Kevin.Brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
To enable tagging on a memory range, the user must explicitly opt in via a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new memory type in the AttrIndx field of a pte, simplify the or'ing of these bits over the protection_map[] attributes by making MT_NORMAL index 0. There are two conditions for arch_vm_get_page_prot() to return the MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE, registered as VM_MTE in the vm_flags, and (2) the vma supports MTE, decided during the mmap() call (only) and registered as VM_MTE_ALLOWED. arch_calc_vm_prot_bits() is responsible for registering the user request as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its mmap() file ops call. In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on stack or brk area. The Linux mmap() syscall currently ignores unknown PROT_* flags. In the presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE will not report an error and the memory will not be mapped as Normal Tagged. For consistency, mprotect(PROT_MTE) will not report an error either if the memory range does not support MTE. Two subsequent patches in the series will propose tightening of this behaviour. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
Similarly to arch_validate_prot() called from do_mprotect_pkey(), an architecture may need to sanity-check the new vm_flags. Define a dummy function always returning true. In addition to do_mprotect_pkey(), also invoke it from mmap_region() prior to updating vma->vm_page_prot to allow the architecture code to veto potentially inconsistent vm_flags. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
Commit 9bceb80 ("arm64: kaslr: Use standard early random function") removed the direct calls of the __arm64_rndr() and __early_cpu_has_rndr() functions, but left the dummy prototypes in the #else branch of the #ifdef CONFIG_ARCH_RANDOM guard. Remove the redundant prototypes, as they have no users outside of this header file. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20201006194453.36519-1-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org>
Late patches for 5.10: MTE selftests, minor KCSAN preparation and removal of some unused prototypes. (Amit Daniel Kachhap and others) * for-next/late-arrivals: arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory
Carve out the MOVDIR64B inline asm primitive into a generic helper so that it can be used by other functions. Move it to special_insns.h and have iosubmit_cmds512() call it. [ bp: Massage commit message. ] Suggested-by: Michael Matz <matz@suse.de> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201005151126.657029-2-dave.jiang@intel.com
Currently, the MOVDIR64B instruction is used to atomically submit
64-byte work descriptors to devices. Although it can encounter errors
like device queue full, command not accepted, device not ready, etc when
writing to a device MMIO, MOVDIR64B can not report back on errors from
the device itself. This means that MOVDIR64B users need to separately
interact with a device to see if a descriptor was successfully queued,
which slows down device interactions.
ENQCMD and ENQCMDS also atomically submit 64-byte work descriptors
to devices. But, they *can* report back errors directly from the
device, such as if the device was busy, or device not enabled or does
not support the command. This immediate feedback from the submission
instruction itself reduces the number of interactions with the device
and can greatly increase efficiency.
ENQCMD can be used at any privilege level, but can effectively only
submit work on behalf of the current process. ENQCMDS is a ring0-only
instruction and can explicitly specify a process context instead of
being tied to the current process or needing to reprogram the IA32_PASID
MSR.
Use ENQCMDS for work submission within the kernel because a Process
Address ID (PASID) is setup to translate the kernel virtual address
space. This PASID is provided to ENQCMDS from the descriptor structure
submitted to the device and not retrieved from IA32_PASID MSR, which is
setup for the current user address space.
See Intel Software Developer’s Manual for more information on the
instructions.
[ bp:
- Make operand constraints like movdir64b() because both insns are
basically doing the same thing, more or less.
- Fixup comments and cleanup. ]
Link: https://lkml.kernel.org/r/20200924180041.34056-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20201005151126.657029-3-dave.jiang@intel.com
Add asm/mce.h to asm/asm-prototypes.h so that that asm symbol's checksum can be generated in order to support CONFIG_MODVERSIONS with it and fix: WARNING: modpost: EXPORT symbol "copy_mc_fragile" [vmlinux] version \ generation failed, symbol will not be versioned. For reference see: 4efca4e ("kbuild: modversions for EXPORT_SYMBOL() for asm") 334bb77 ("x86/kbuild: enable modversions for symbols exported from asm") Fixes: ec6347b ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()") Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201007111447.GA23257@zn.tnic
This reverts commit 353e228. Qian Cai reports that TX2 no longer boots with his .config as it appears that task_cpu() gets instrumented and used before KASAN has been initialised. Although Mark has a proposed fix, let's take the safe option of reverting this for now and sorting it out properly later. Link: https://lore.kernel.org/r/711bc57a314d8d646b41307008db2845b7537b3d.camel@redhat.com Reported-by: Qian Cai <cai@redhat.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
AMD Family 19h Models 20h-2Fh use the same PCI IDs as Family 17h Models 70h-7Fh. The same family ops and number of channels also apply. Use the Family17h Model 70h family_type and ops for Family 19h Models 20h-2Fh. Update the controller name to match the system. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201009171803.3214354-1-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
…rnel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "Support for a new TPM device and fixes and Git URL change (infraded -> korg)" * tag 'tpmdd-next-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: MAINTAINERS: TPM DEVICE DRIVER: Update GIT tpm_tis: Add a check for invalid status tpm: use %*ph to print small buffer dt-bindings: Add SynQucer TPM MMIO as a trivial device tpm: tis: add support for MMIO TPM on SynQuacer
…el/git/arm64/linux
Pull arm64 updates from Will Deacon:
"There's quite a lot of code here, but much of it is due to the
addition of a new PMU driver as well as some arm64-specific selftests
which is an area where we've traditionally been lagging a bit.
In terms of exciting features, this includes support for the Memory
Tagging Extension which narrowly missed 5.9, hopefully allowing
userspace to run with use-after-free detection in production on CPUs
that support it. Work is ongoing to integrate the feature with KASAN
for 5.11.
Another change that I'm excited about (assuming they get the hardware
right) is preparing the ASID allocator for sharing the CPU page-table
with the SMMU. Those changes will also come in via Joerg with the
IOMMU pull.
We do stray outside of our usual directories in a few places, mostly
due to core changes required by MTE. Although much of this has been
Acked, there were a couple of places where we unfortunately didn't get
any review feedback.
Other than that, we ran into a handful of minor conflicts in -next,
but nothing that should post any issues.
Summary:
- Userspace support for the Memory Tagging Extension introduced by
Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.
- Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
switching.
- Fix and subsequent rewrite of our Spectre mitigations, including
the addition of support for PR_SPEC_DISABLE_NOEXEC.
- Support for the Armv8.3 Pointer Authentication enhancements.
- Support for ASID pinning, which is required when sharing
page-tables with the SMMU.
- MM updates, including treating flush_tlb_fix_spurious_fault() as a
no-op.
- Perf/PMU driver updates, including addition of the ARM CMN PMU
driver and also support to handle CPU PMU IRQs as NMIs.
- Allow prefetchable PCI BARs to be exposed to userspace using normal
non-cacheable mappings.
- Implementation of ARCH_STACKWALK for unwinding.
- Improve reporting of unexpected kernel traps due to BPF JIT
failure.
- Improve robustness of user-visible HWCAP strings and their
corresponding numerical constants.
- Removal of TEXT_OFFSET.
- Removal of some unused functions, parameters and prototypes.
- Removal of MPIDR-based topology detection in favour of firmware
description.
- Cleanups to handling of SVE and FPSIMD register state in
preparation for potential future optimisation of handling across
syscalls.
- Cleanups to the SDEI driver in preparation for support in KVM.
- Miscellaneous cleanups and refactoring work"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
Revert "arm64: initialize per-cpu offsets earlier"
arm64: random: Remove no longer needed prototypes
arm64: initialize per-cpu offsets earlier
kselftest/arm64: Check mte tagged user address in kernel
kselftest/arm64: Verify KSM page merge for MTE pages
kselftest/arm64: Verify all different mmap MTE options
kselftest/arm64: Check forked child mte memory accessibility
kselftest/arm64: Verify mte tag inclusion via prctl
kselftest/arm64: Add utilities and a test to validate mte memory
perf: arm-cmn: Fix conversion specifiers for node type
perf: arm-cmn: Fix unsigned comparison to less than zero
arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
arm64: Get rid of arm64_ssbd_state
KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
KVM: arm64: Get rid of kvm_arm_have_ssbd()
KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
...
Pull Microblaze build warning fix from Michal Simek. * tag 'microblaze-v5.10' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: fix kbuild redundant file warning
…/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Conversion of the Mac IDE driver to a platform driver - Minor cleanups and fixes * tag 'm68k-for-v5.10-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: ide/macide: Convert Mac IDE driver to platform driver m68k: Replace HTTP links with HTTPS ones m68k: mm: Remove superfluous memblock_alloc*() casts m68k: mm: Use PAGE_ALIGNED() helper m68k: Sort selects in main Kconfig m68k: amiga: Clean up Amiga hardware configuration m68k: Revive _TIF_* masks m68k: Correct some typos in comments m68k: Use get_kernel_nofault() in show_registers() zorro: Fix address space collision message with RAM expansion boards m68k: amiga: Fix Denise detection on OCS
…nux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Add Amazon's Annapurna Labs memory controller EDAC driver (Talel
Shenhar)
- New AMD CPUs support (Yazen Ghannam)
- The usual misc fixes and cleanups all over the subsystem
* tag 'edac_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh
EDAC/mc_sysfs: Add missing newlines when printing {max,dimm}_location
EDAC/aspeed: Use module_platform_driver() to simplify
EDAC, sb_edac: Simplify switch statement
EDAC/ti: Fix handling of platform_get_irq() error
EDAC/aspeed: Fix handling of platform_get_irq() error
EDAC/i5100: Fix error handling order in i5100_init_one()
EDAC/highbank: Handover Calxeda Highbank maintenance to Andre Przywara
EDAC/socfpga: Transfer SoCFPGA EDAC maintainership
EDAC/thunderx: Make symbol lmc_dfs_ents static
EDAC/al-mc-edac: Add Amazon's Annapurna Labs Memory Controller driver
dt-bindings: EDAC: Add Amazon's Annapurna Labs Memory Controller binding
EDAC/mce_amd: Add new error descriptions for existing types
EDAC: Replace HTTP links with HTTPS ones
…ux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory
by sending them a SIGBUS on return to user space and umapping the
faulty memory, by Tony Luck and Youquan Song.
- memcpy_mcsafe() rework by splitting the functionality into
copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
support for new hardware which can recover from a machine check
encountered during a fast string copy and makes that the default and
lets the older hardware which does not support that advance recovery,
opt in to use the old, fragile, slow variant, by Dan Williams.
- New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.
- Do not use MSR-tracing accessors in #MC context and flag any fault
while accessing MCA architectural MSRs as an architectural violation
with the hope that such hw/fw misdesigns are caught early during the
hw eval phase and they don't make it into production.
- Misc fixes, improvements and cleanups, as always.
* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
x86/mce: Decode a kernel instruction to determine if it is copying from user
x86/mce: Recover from poison found while copying from user space
x86/mce: Avoid tail copy when machine check terminated a copy from user
x86/mce: Add _ASM_EXTABLE_CPY for copy user access
x86/mce: Provide method to find out the type of an exception handler
x86/mce: Pass pointer to saved pt_regs to severity calculation routines
x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
x86/mce: Add Skylake quirk for patrol scrub reported errors
RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
x86/mce: Annotate mce_rd/wrmsrl() with noinstr
x86/mce/dev-mcelog: Do not update kflags on AMD systems
x86/mce: Stop mce_reign() from re-computing severity for every CPU
x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
x86/mce: Increase maximum number of banks to 64
x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
RAS/CEC: Fix cec_init() prototype
…ernel/git/tip/tip Pull x86 cpu updates from Borislav Petkov: - Add support for hardware-enforced cache coherency on AMD which obviates the need to flush cachelines before changing the PTE encryption bit (Krish Sadhukhan) - Add Centaur initialization support for families >= 7 (Tony W Wang-oc) - Add a feature flag for, and expose TSX suspend load tracking feature to KVM (Cathy Zhang) - Emulate SLDT and STR so that windows programs don't crash on UMIP machines (Brendan Shanks and Ricardo Neri) - Use the new SERIALIZE insn on Intel hardware which supports it (Ricardo Neri) - Misc cleanups and fixes * tag 'x86_cpu_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: SVM: Don't flush cache if hardware enforces cache coherency across encryption domains x86/mm/pat: Don't flush cache if hardware enforces cache coherency across encryption domnains x86/cpu: Add hardware-enforced cache coherency as a CPUID feature x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support x86/cpu/centaur: Replace two-condition switch-case with an if statement x86/kvm: Expose TSX Suspend Load Tracking feature x86/cpufeatures: Enumerate TSX suspend load address tracking instructions x86/umip: Add emulation/spoofing for SLDT and STR instructions x86/cpu: Fix typos and improve the comments in sync_core() x86/cpu: Use XGETBV and XSETBV mnemonics in fpu/internal.h x86/cpu: Use SERIALIZE in sync_core() when available
…nux/kernel/git/tip/tip Pull x86 platform updates from Borislav Petkov: - Cleanup different aspects of the UV code and start adding support for the new UV5 class of systems (Mike Travis) - Use a flexible array for a dynamically sized struct uv_rtc_timer_head (Gustavo A. R. Silva) * tag 'x86_platform_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Update Copyrights to conform to HPE standards x86/platform/uv: Update for UV5 NMI MMR changes x86/platform/uv: Update UV5 TSC checking x86/platform/uv: Update node present counting x86/platform/uv: Update UV5 MMR references in UV GRU x86/platform/uv: Adjust GAM MMR references affected by UV5 updates x86/platform/uv: Update MMIOH references based on new UV5 MMRs x86/platform/uv: Add and decode Arch Type in UVsystab x86/platform/uv: Add UV5 direct references x86/platform/uv: Update UV MMRs for UV5 drivers/misc/sgi-xp: Adjust references in UV kernel modules x86/platform/uv: Remove SCIR MMR references for UV systems x86/platform/uv: Remove UV BAU TLB Shootdown Handler x86/uv/time: Use a flexible array in struct uv_rtc_timer_head
…kernel/git/tip/tip
Pull x86 PASID updates from Borislav Petkov:
"Initial support for sharing virtual addresses between the CPU and
devices which doesn't need pinning of pages for DMA anymore.
Add support for the command submission to devices using new x86
instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support
for process address space identifiers (PASIDs) which are referenced by
those command submission instructions along with the handling of the
PASID state on context switch as another extended state.
Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang"
* tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction
x86/asm: Carve out a generic movdir64b() helper for general usage
x86/mmu: Allocate/free a PASID
x86/cpufeatures: Mark ENQCMD as disabled when configured out
mm: Add a pasid member to struct mm_struct
x86/msr-index: Define an IA32_PASID MSR
x86/fpu/xstate: Add supervisor PASID state for ENQCMD
x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)
iommu/vt-d: Change flags type to unsigned int in binding mm
drm, iommu: Change type of pasid to u32
…kernel/git/tip/tip Pull misc x86 fixes fromm Borislav Petkov: - Ratelimit the message about writes to unrecognized MSRs so that they don't spam the console log (Chris Down) - Document how the /proc/cpuinfo machinery works for future reference (Kyung Min Park, Ricardo Neri and Dave Hansen) - Correct the current NMI's duration calculation (Libing Zhou) * tag 'x86_misc_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Fix nmi_handle() duration miscalculation Documentation/x86: Add documentation for /proc/cpuinfo feature flags x86/msr: Make source of unrecognised MSR writes unambiguous x86/msr: Prevent userspace MSR access from dominating the console
…nux/kernel/git/tip/tip Pull x86 fsgsbase updates from Borislav Petkov: "Misc minor cleanups and corrections to the fsgsbase code and respective selftests" * tag 'x86_fsgsbase_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86/fsgsbase: Test PTRACE_PEEKUSER for GSBASE with invalid LDT GS selftests/x86/fsgsbase: Reap a forgotten child x86/fsgsbase: Replace static_cpu_has() with boot_cpu_has() x86/entry/64: Correct the comment over SAVE_AND_SET_GSBASE
…ernel/git/tip/tip Pull x86 fpu updates from Borislav Petkov: - Allow clearcpuid= to accept multiple bits (Arvind Sankar) - Move clearcpuid= parameter handling earlier in the boot, away from the FPU init code and to a generic location (Mike Hommey) * tag 'x86_fpu_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Handle FPU-related and clearcpuid command line arguments earlier x86/fpu: Allow multiple bits in clearcpuid= parameter
…nux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "Misc minor cleanups" * tag 'x86_cleanups_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix typo in comments for syscall_enter_from_user_mode() x86/resctrl: Fix spelling in user-visible warning messages x86/entry/64: Do not include inst.h in calling.h x86/mpparse: Remove duplicate io_apic.h include
…/kernel/git/tip/tip
Pull x86 cache resource control updates from Borislav Petkov:
- Misc cleanups to the resctrl code in preparation for the ARM side
(James Morse)
- Add support for controlling per-thread memory bandwidth throttling
delay values on hw which supports it (Fenghua Yu)
* tag 'x86_cache_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Enable user to view thread or core throttling mode
x86/resctrl: Enumerate per-thread MBA controls
cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file
x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps
x86/resctrl: Merge AMD/Intel parse_bw() calls
x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference
x86/resctrl: Use is_closid_match() in more places
x86/resctrl: Include pid.h
x86/resctrl: Use container_of() in delayed_work handlers
x86/resctrl: Fix stale comment
x86/resctrl: Remove struct rdt_membw::max_delay
x86/resctrl: Remove unused struct mbm_state::chunks_bw
…kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "A single fix making the error message when the opcode bytes at rIP cannot be accessed during an oops, more precise" * tag 'x86_core_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Fix misleading instruction pointer error message
pull bot
pushed a commit
that referenced
this pull request
Aug 26, 2022
…nsigned fw_level Though acpi_find_last_cache_level() always returned signed value and the document states it will return any errors caused by lack of a PPTT table, it never returned negative values before. Commit 0c80f9e ("ACPI: PPTT: Leave the table mapped for the runtime usage") however changed it by returning -ENOENT if no PPTT was found. The value returned from acpi_find_last_cache_level() is then assigned to unsigned fw_level. It will result in the number of cache leaves calculated incorrectly as a huge value which will then cause the following warning from __alloc_pages as the order would be great than MAX_ORDER because of incorrect and huge cache leaves value. | WARNING: CPU: 0 PID: 1 at mm/page_alloc.c:5407 __alloc_pages+0x74/0x314 | Modules linked in: | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-10393-g7c2a8d3ac4c0 #73 | pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __alloc_pages+0x74/0x314 | lr : alloc_pages+0xe8/0x318 | Call trace: | __alloc_pages+0x74/0x314 | alloc_pages+0xe8/0x318 | kmalloc_order_trace+0x68/0x1dc | __kmalloc+0x240/0x338 | detect_cache_attributes+0xe0/0x56c | update_siblings_masks+0x38/0x284 | store_cpu_topology+0x78/0x84 | smp_prepare_cpus+0x48/0x134 | kernel_init_freeable+0xc4/0x14c | kernel_init+0x2c/0x1b4 | ret_from_fork+0x10/0x20 Fix the same by changing fw_level to be signed integer and return the error from init_cache_level() early in case of error. Reported-and-Tested-by: Bruno Goncalves <bgoncalv@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20220808084640.3165368-1-sudeep.holla@arm.com Signed-off-by: Will Deacon <will@kernel.org>
pull bot
pushed a commit
that referenced
this pull request
Oct 11, 2022
This is a BUG_ON issue as follows when running xfstest-generic-503: WARNING: CPU: 21 PID: 1385 at fs/f2fs/inode.c:762 f2fs_evict_inode+0x847/0xaa0 Modules linked in: CPU: 21 PID: 1385 Comm: umount Not tainted 5.19.0-rc5+ #73 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 Call Trace: evict+0x129/0x2d0 dispose_list+0x4f/0xb0 evict_inodes+0x204/0x230 generic_shutdown_super+0x5b/0x1e0 kill_block_super+0x29/0x80 kill_f2fs_super+0xe6/0x140 deactivate_locked_super+0x44/0xc0 deactivate_super+0x79/0x90 cleanup_mnt+0x114/0x1a0 __cleanup_mnt+0x16/0x20 task_work_run+0x98/0x100 exit_to_user_mode_prepare+0x3d0/0x3e0 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Function flow analysis when BUG occurs: f2fs_fallocate mmap do_page_fault pte_spinlock // ---lock_pte do_wp_page wp_page_shared pte_unmap_unlock // unlock_pte do_page_mkwrite f2fs_vm_page_mkwrite down_read(invalidate_lock) lock_page if (PageMappedToDisk(page)) goto out; // set_page_dirty --NOT RUN out: up_read(invalidate_lock); finish_mkwrite_fault // unlock_pte f2fs_collapse_range down_write(i_mmap_sem) truncate_pagecache unmap_mapping_pages i_mmap_lock_write // down_write(i_mmap_rwsem) ...... zap_pte_range pte_offset_map_lock // ---lock_pte set_page_dirty f2fs_dirty_data_folio if (!folio_test_dirty(folio)) { fault_dirty_shared_page set_page_dirty f2fs_dirty_data_folio if (!folio_test_dirty(folio)) { filemap_dirty_folio f2fs_update_dirty_folio // ++ } unlock_page filemap_dirty_folio f2fs_update_dirty_folio // page count++ } pte_unmap_unlock // --unlock_pte i_mmap_unlock_write // up_write(i_mmap_rwsem) truncate_inode_pages up_write(i_mmap_sem) When race happens between mmap-do_page_fault-wp_page_shared and fallocate-truncate_pagecache-zap_pte_range, the zap_pte_range calls function set_page_dirty without page lock. Besides, though truncate_pagecache has immap and pte lock, wp_page_shared calls fault_dirty_shared_page without any. In this case, two threads race in f2fs_dirty_data_folio function. Page is set to dirty only ONCE, but the count is added TWICE by calling filemap_dirty_folio. Thus the count of dirty page cannot accord with the real dirty pages. Following is the solution to in case of race happens without any lock. Since folio_test_set_dirty in filemap_dirty_folio is atomic, judge return value will not be at risk of race. Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
pull bot
pushed a commit
that referenced
this pull request
Nov 10, 2023
This command: $ perf record -e cycles:k -e instructions:k -c 10000 -m 64M dd if=/dev/zero of=/dev/null count=1000 gives rise to this kernel warning: [ 444.364395] WARNING: CPU: 0 PID: 104 at kernel/smp.c:775 smp_call_function_many_cond+0x42c/0x436 [ 444.364515] Modules linked in: [ 444.364657] CPU: 0 PID: 104 Comm: perf-exec Not tainted 6.6.0-rc6-00051-g391df82e8ec3-dirty #73 [ 444.364771] Hardware name: riscv-virtio,qemu (DT) [ 444.364868] epc : smp_call_function_many_cond+0x42c/0x436 [ 444.364917] ra : on_each_cpu_cond_mask+0x20/0x32 [ 444.364948] epc : ffffffff8009f9e0 ra : ffffffff8009fa5a sp : ff20000000003800 [ 444.364966] gp : ffffffff81500aa0 tp : ff60000002b83000 t0 : ff200000000038c0 [ 444.364982] t1 : ffffffff815021f0 t2 : 000000000000001f s0 : ff200000000038b0 [ 444.364998] s1 : ff60000002c54d98 a0 : ff60000002a73940 a1 : 0000000000000000 [ 444.365013] a2 : 0000000000000000 a3 : 0000000000000003 a4 : 0000000000000100 [ 444.365029] a5 : 0000000000010100 a6 : 0000000000f00000 a7 : 0000000000000000 [ 444.365044] s2 : 0000000000000000 s3 : ffffffffffffffff s4 : ff60000002c54d98 [ 444.365060] s5 : ffffffff81539610 s6 : ffffffff80c20c48 s7 : 0000000000000000 [ 444.365075] s8 : 0000000000000000 s9 : 0000000000000001 s10: 0000000000000001 [ 444.365090] s11: ffffffff80099394 t3 : 0000000000000003 t4 : 00000000eac0c6e6 [ 444.365104] t5 : 0000000400000000 t6 : ff60000002e010d0 [ 444.365120] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 444.365226] [<ffffffff8009f9e0>] smp_call_function_many_cond+0x42c/0x436 [ 444.365295] [<ffffffff8009fa5a>] on_each_cpu_cond_mask+0x20/0x32 [ 444.365311] [<ffffffff806e90dc>] pmu_sbi_ctr_start+0x7a/0xaa [ 444.365327] [<ffffffff806e880c>] riscv_pmu_start+0x48/0x66 [ 444.365339] [<ffffffff8012111a>] perf_adjust_freq_unthr_context+0x196/0x1ac [ 444.365356] [<ffffffff801237aa>] perf_event_task_tick+0x78/0x8c [ 444.365368] [<ffffffff8003faf4>] scheduler_tick+0xe6/0x25e [ 444.365383] [<ffffffff8008a042>] update_process_times+0x80/0x96 [ 444.365398] [<ffffffff800991ec>] tick_sched_handle+0x26/0x52 [ 444.365410] [<ffffffff800993e4>] tick_sched_timer+0x50/0x98 [ 444.365422] [<ffffffff8008a6aa>] __hrtimer_run_queues+0x126/0x18a [ 444.365433] [<ffffffff8008b350>] hrtimer_interrupt+0xce/0x1da [ 444.365444] [<ffffffff806cdc60>] riscv_timer_interrupt+0x30/0x3a [ 444.365457] [<ffffffff8006afa6>] handle_percpu_devid_irq+0x80/0x114 [ 444.365470] [<ffffffff80065b82>] generic_handle_domain_irq+0x1c/0x2a [ 444.365483] [<ffffffff8045faec>] riscv_intc_irq+0x2e/0x46 [ 444.365497] [<ffffffff808a9c62>] handle_riscv_irq+0x4a/0x74 [ 444.365521] [<ffffffff808aa760>] do_irq+0x7c/0x7e [ 444.365796] ---[ end trace 0000000000000000 ]--- That's because the fix in commit 3fec323 ("drivers: perf: Fix panic in riscv SBI mmap support") was wrong since there is no need to broadcast to other cpus when starting a counter, that's only needed in mmap when the counters could have already been started on other cpus, so simply remove this broadcast. Fixes: 3fec323 ("drivers: perf: Fix panic in riscv SBI mmap support") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Clément Léger <cleger@rivosinc.com> Tested-by: Yu Chien Peter Lin <peterlin@andestech.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> #On Link: https://lore.kernel.org/r/20231026084010.11888-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 27, 2025
If a timerlat tracer is started with the osnoise option OSNOISE_WORKLOAD disabled, but then that option is enabled and timerlat is removed, the tracepoints that were enabled on timerlat registration do not get disabled. If the option is disabled again and timelat is started, then it triggers a warning in the tracepoint code due to registering the tracepoint again without ever disabling it. Do not use the same user space defined options to know to disable the tracepoints when timerlat is removed. Instead, set a global flag when it is enabled and use that flag to know to disable the events. ~# echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options ~# echo timerlat > /sys/kernel/tracing/current_tracer ~# echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options ~# echo nop > /sys/kernel/tracing/current_tracer ~# echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options ~# echo timerlat > /sys/kernel/tracing/current_tracer Triggers: ------------[ cut here ]------------ WARNING: CPU: 6 PID: 1337 at kernel/tracepoint.c:294 tracepoint_add_func+0x3b6/0x3f0 Modules linked in: CPU: 6 UID: 0 PID: 1337 Comm: rtla Not tainted 6.13.0-rc4-test-00018-ga867c441128e-dirty #73 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:tracepoint_add_func+0x3b6/0x3f0 Code: 48 8b 53 28 48 8b 73 20 4c 89 04 24 e8 23 59 11 00 4c 8b 04 24 e9 36 fe ff ff 0f 0b b8 ea ff ff ff 45 84 e4 0f 84 68 fe ff ff <0f> 0b e9 61 fe ff ff 48 8b 7b 18 48 85 ff 0f 84 4f ff ff ff 49 8b RSP: 0018:ffffb9b003a87ca0 EFLAGS: 00010202 RAX: 00000000ffffffef RBX: ffffffff92f30860 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9bf59e91ccd0 RDI: ffffffff913b6410 RBP: 000000000000000a R08: 00000000000005c7 R09: 0000000000000002 R10: ffffb9b003a87ce0 R11: 0000000000000002 R12: 0000000000000001 R13: ffffb9b003a87ce0 R14: ffffffffffffffef R15: 0000000000000008 FS: 00007fce81209240(0000) GS:ffff9bf6fdd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055e99b728000 CR3: 00000001277c0002 CR4: 0000000000172ef0 Call Trace: <TASK> ? __warn.cold+0xb7/0x14d ? tracepoint_add_func+0x3b6/0x3f0 ? report_bug+0xea/0x170 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __pfx_trace_sched_migrate_callback+0x10/0x10 ? tracepoint_add_func+0x3b6/0x3f0 ? __pfx_trace_sched_migrate_callback+0x10/0x10 ? __pfx_trace_sched_migrate_callback+0x10/0x10 tracepoint_probe_register+0x78/0xb0 ? __pfx_trace_sched_migrate_callback+0x10/0x10 osnoise_workload_start+0x2b5/0x370 timerlat_tracer_init+0x76/0x1b0 tracing_set_tracer+0x244/0x400 tracing_set_trace_write+0xa0/0xe0 vfs_write+0xfc/0x570 ? do_sys_openat2+0x9c/0xe0 ksys_write+0x72/0xf0 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: John Kacur <jkacur@redhat.com> Link: https://lore.kernel.org/20250123204159.4450c88e@gandalf.local.home Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
pull bot
pushed a commit
that referenced
this pull request
Jan 25, 2026
Revert commit bfc467d ("serial: remove redundant tty_port_link_device()") because the tty_port_link_device() is not redundant: the tty->port has to be confured before we call uart_configure_port(), otherwise user-space can open console without TTY linked to the driver. This tty_port_link_device() was added explicitly to avoid this exact issue in commit fb2b900 ("tty: link tty and port before configuring it as console"), so offending commit basically reverted the fix saying it is redundant without addressing the actual race condition presented there. Reproducible always as tty->port warning on Qualcomm SoC with most of devices disabled, so with very fast boot, and one serial device being the console: printk: legacy console [ttyMSM0] enabled printk: legacy console [ttyMSM0] enabled printk: legacy bootconsole [qcom_geni0] disabled printk: legacy bootconsole [qcom_geni0] disabled ------------[ cut here ]------------ tty_init_dev: ttyMSM driver does not set tty->port. This would crash the kernel. Fix the driver! WARNING: drivers/tty/tty_io.c:1414 at tty_init_dev.part.0+0x228/0x25c, CPU#2: systemd/1 Modules linked in: socinfo tcsrcc_eliza gcc_eliza sm3_ce fuse ipv6 CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G S 6.19.0-rc4-next-20260108-00024-g2202f4d30aa8 #73 PREEMPT Tainted: [S]=CPU_OUT_OF_SPEC Hardware name: Qualcomm Technologies, Inc. Eliza (DT) ... tty_init_dev.part.0 (drivers/tty/tty_io.c:1414 (discriminator 11)) (P) tty_open (arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 3) drivers/tty/tty_io.c:2073 (discriminator 3) drivers/tty/tty_io.c:2120 (discriminator 3)) chrdev_open (fs/char_dev.c:411) do_dentry_open (fs/open.c:962) vfs_open (fs/open.c:1094) do_open (fs/namei.c:4634) path_openat (fs/namei.c:4793) do_filp_open (fs/namei.c:4820) do_sys_openat2 (fs/open.c:1391 (discriminator 3)) ... Starting Network Name Resolution... Apparently the flow with this small Yocto-based ramdisk user-space is: driver (qcom_geni_serial.c): user-space: ============================ =========== qcom_geni_serial_probe() uart_add_one_port() serial_core_register_port() serial_core_add_one_port() uart_configure_port() register_console() | | open console | ... | tty_init_dev() | driver->ports[idx] is NULL | tty_port_register_device_attr_serdev() tty_port_link_device() <- set driver->ports[idx] Fixes: bfc467d ("serial: remove redundant tty_port_link_device()") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Link: https://patch.msgid.link/20260123072139.53293-2-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot]. Want to support this open source service? Please star it : )